Conversation
WalkthroughAdds an evaluation subsystem (EvalAPI, EvalJobScheduler), integrates eval execution into request/session lifecycle and server startup, introduces SHA-256 hash utilities and tests, propagates eval identifiers through prompt metadata, and includes minor infra/logging and .gitignore updates. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Router
participant ContextHandler
participant EvalJobScheduler
participant EvalAPI
participant EvalModule
rect #f0f7ff
Router->>ContextHandler: handle request -> waitUntilAll(sessionId)
ContextHandler->>ContextHandler: await pending promises
end
ContextHandler->>EvalJobScheduler: getJobs(sessionId)
EvalJobScheduler-->>ContextHandler: [PendingEvalJob...]
loop Per job
ContextHandler->>EvalAPI: loadEvalMetadataMap()
EvalAPI-->>ContextHandler: Map<slug,id>
loop Per prompt in job
ContextHandler->>EvalAPI: runEval(evalName,input,output,sessionId,spanId)
EvalAPI->>EvalModule: dynamic import (file URL)
EvalModule-->>EvalAPI: EvalFunction
EvalAPI->>EvalModule: invoke eval with EvalContext
EvalModule-->>EvalAPI: EvalRunResult
EvalAPI->>EvalAPI: POST /evalrun/finish
EvalAPI-->>ContextHandler: CreateEvalRunResponse
end
ContextHandler->>EvalJobScheduler: removeJob(spanId)
end
ContextHandler->>ContextHandler: mark session completed
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Warning Review ran into problems🔥 ProblemsErrors were encountered while retrieving linked issues. Errors (1)
Comment |
jhaynie
left a comment
There was a problem hiding this comment.
need to think about idle detection so we dont suspend process while running evals.
| "esbuild": ">=0.25.0" | ||
| }, | ||
| "dependencies": { | ||
| "@clickhouse/client": "^1.0.0", |
There was a problem hiding this comment.
yeah lol my bad need to remove this
|
|
||
| const url = new URL(req.url); | ||
|
|
||
| // Handle eval routes |
There was a problem hiding this comment.
just make it a normal route handler
There was a problem hiding this comment.
removed the route stuff
There was a problem hiding this comment.
Actionable comments posted: 11
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/apis/prompt/index.ts (1)
95-121: Use CommonJS require() for generated module loading (per project rules)This loader uses dynamic ESM import with a file URL. Our guidelines mandate require() + fs.existsSync + try/catch for bundler compatibility and runtime loading.
Apply this diff (switch to createRequire + existsSync, preserve cache‑bust via require.cache):
-import fs from 'fs/promises'; +import fs from 'fs/promises'; +import fsSync from 'fs'; import path from 'path'; -import { pathToFileURL } from 'url'; +import { createRequire } from 'module'; @@ public async loadPrompts(): Promise<void> { @@ - internal.debug('Trying absolute paths:', possiblePaths); - const attemptErrors: string[] = []; - for (const possiblePath of possiblePaths) { - internal.debug(' Checking:', possiblePath); - try { - await fs.access(possiblePath); - internal.debug(' ✓ File exists:', possiblePath); - - // Get file stats for cache-busting - const stats = await fs.stat(possiblePath); - const mtime = stats.mtime.getTime(); - - // Convert to file URL with cache-busting query param - const fileUrl = pathToFileURL(possiblePath).href + `?t=${mtime}`; - internal.debug(' Importing from:', fileUrl); - - // Use ESM dynamic import instead of require - generatedModule = await import(fileUrl); - internal.debug(' ✓ Successfully loaded from:', possiblePath); - break; - } catch (error) { - const errorMsg = - error instanceof Error ? error.message : String(error); - internal.debug(` ✗ Failed to load ${possiblePath}: ${errorMsg}`); - attemptErrors.push(`${possiblePath}: ${errorMsg}`); - } - } + internal.debug('Trying absolute paths:', possiblePaths); + const attemptErrors: string[] = []; + const require = createRequire(import.meta.url); + for (const possiblePath of possiblePaths) { + internal.debug(' Checking:', possiblePath); + try { + if (!fsSync.existsSync(possiblePath)) { + internal.debug(' ✗ File not found:', possiblePath); + attemptErrors.push(`${possiblePath}: not found`); + continue; + } + internal.debug(' ✓ File exists:', possiblePath); + // Cache-bust require on mtime + try { + const stats = await fs.stat(possiblePath); + const resolved = require.resolve(possiblePath); + const cached = require.cache?.[resolved]; + if (cached && (cached as any).loaded) { + delete require.cache[resolved]; + internal.debug(' ↻ Cleared require cache:', resolved, stats.mtimeMs); + } + } catch { + /* ignore cache clear errors */ + } + generatedModule = require(possiblePath); + internal.debug(' ✓ Successfully loaded from:', possiblePath); + break; + } catch (error) { + const errorMsg = + error instanceof Error ? error.message : String(error); + internal.debug(` ✗ Failed to load ${possiblePath}: ${errorMsg}`); + attemptErrors.push(`${possiblePath}: ${errorMsg}`); + } + } @@ - throw new Error( - `Generated prompts file not found. Tried:\n${attemptErrors.join('\n')}` - ); + throw new Error( + `Generated prompts file not found. Tried:\n${attemptErrors.join('\n')}` + );As per coding guidelines / Based on learnings.
src/utils/promptMetadata.ts (1)
18-18: Remove stray expression statement[];This no-op is likely accidental and can confuse readers.
Apply:
-[];
🧹 Nitpick comments (18)
src/server/internal-routes.ts (3)
16-18: Match '/_agentuity' with or without trailing slash and strip query.Current check misses '/_agentuity' and paths with query. Parse pathname first.
- isInternalRoute(url: string): boolean { - return url.startsWith('/_agentuity/'); - } + isInternalRoute(url: string): boolean { + const pathname = url.startsWith('http') + ? new URL(url).pathname + : url.split('?')[0]; + return pathname === '/_agentuity' || pathname.startsWith('/_agentuity/'); + }
31-32: Return structured 404 with content-type.Helps clients and avoids text bodies.
- return new Response('Internal route not found', { status: 404 }); + return new Response(JSON.stringify({ error: 'not_found' }), { + status: 404, + headers: { 'content-type': 'application/json' }, + });
9-11: Keep the logger for future, but store it for consistency.Small polish; no behavior change.
-export class InternalRoutesHandler { - constructor(_logger: Logger) { - // Logger parameter kept for future use - } +export class InternalRoutesHandler { + constructor(private readonly logger: Logger) {}src/apis/evaljobscheduler.ts (3)
30-35: Use slice instead of deprecated substr.Avoid deprecated String.prototype.substr.
- this.instanceId = `EvalJobScheduler-${Date.now()}-${Math.random().toString(36).substr(2, 9)}`; + this.instanceId = `EvalJobScheduler-${Date.now()}-${Math.random() + .toString(36) + .slice(2, 11)}`;
40-60: getInstance doesn’t need to be async.No awaits present; returning Promise adds overhead for callers.
- public static async getInstance(): Promise<EvalJobScheduler> { + public static getInstance(): EvalJobScheduler { // ...unchanged... - return globalThis.__evalJobSchedulerInstance; + return globalThis.__evalJobSchedulerInstance!; }Note: adjust call sites accordingly.
99-104: Consider TTL or max-capacity to prevent unbounded growth.Add eviction (e.g., max jobs per session, time-based cleanup).
src/apis/eval.ts (6)
248-289: Validate and normalize score/pass results before emitting.
- Clamp score to [0,1].
- Keep comment and behavior aligned.
- score: ( - val: number, + score: ( + val: number, meta?: { reasoning?: string; [key: string]: unknown } ) => { + const score = Math.max(0, Math.min(1, val)); createEvalRunRequest = { projectId, sessionId, spanId, result: { success: true, - score: val, + score, metadata: { reason: meta?.reasoning || '', ...meta, }, }, evalId: metadata.id, promptHash, }; },
225-233: Guard missing projectId.If AGENTUITY_CLOUD_PROJECT_ID is unset, fail fast or warn loudly.
- const projectId = process.env.AGENTUITY_CLOUD_PROJECT_ID || ''; + const projectId = process.env.AGENTUITY_CLOUD_PROJECT_ID || ''; + if (!projectId) { + internal.warn('AGENTUITY_CLOUD_PROJECT_ID is not set; eval runs may not persist.'); + }
334-335: Use slice instead of deprecated substr in ID generation.- id: `local-${Date.now()}-${Math.random().toString(36).substr(2, 9)}`, + id: `local-${Date.now()}-${Math.random().toString(36).slice(2, 11)}`,
410-417: Make metadata regex resilient to type annotations.Handles
export const metadata: Type = { ... }.- const metadataRegex = /export\s+const\s+metadata\s*=\s*\{/; + const metadataRegex = + /export\s+const\s+metadata(?:\s*:\s*[^=]+)?\s*=\s*\{/;
445-463: String-munging JSON is brittle.Quotes in values, nested objects, or keys with hyphens will break. Prefer importing the module and reading
metadataor use a tiny parser for the export.Would you like me to replace this with a safe static analysis using a parser (e.g., acorn) or switch to dynamic import for metadata only?
117-152: Sequential dynamic imports may be slow for many evals.Optional: parallelize with Promise.allSettled on filtered files, then pick the first match.
Also applies to: 170-205
src/router/router.ts (1)
428-438: Re-check pending header semantics.Since you now await waitUntilAll, hasPending() should usually be false; consider removing the header or gating wait by config.
Would you like me to add a config flag (e.g., awaitWaitUntilAll: boolean) to preserve previous non-blocking behavior?
src/server/bun.ts (2)
65-67: Safer global leakage for internal loggerAvoid a plain string property on globalThis. Prefer a Symbol or a unique namespace object to reduce collision risk.
Apply this diff:
- (globalThis as any).__agentuityInternalLogger = logger; + const kInternalLogger = Symbol.for('@agentuity/internalLogger'); + (globalThis as any)[kInternalLogger] = logger;
178-191: Internal routes: add CORS, generate runId, and guard body
- Add '*' CORS header for parity with Node handler.
- Pass a generated runId (helps downstream logging/tracing).
- Guard req.body (can be null in Bun).
Apply this diff:
- // Handle internal routes first - if (url.pathname.startsWith('/_agentuity/')) { - const internalResponse = await internalRoutes.handleInternalRoute({ - method, - url: url.pathname, - headers: req.headers.toJSON(), - body: req.body as unknown as ReadableStream<ReadableDataType>, - request: getRequestFromHeaders(req.headers.toJSON(), ''), - setTimeout: (_val: number) => void 0, - }); - if (internalResponse) { - return internalResponse; - } - } + // Handle internal routes first + if (url.pathname.startsWith('/_agentuity/')) { + const runId = + globalThis.crypto?.randomUUID?.() ?? + Date.now().toString(36); + const internalResponse = await internalRoutes.handleInternalRoute({ + method, + url: url.pathname, + headers: req.headers.toJSON(), + body: + (req.body ?? + undefined) as unknown as ReadableStream<ReadableDataType>, + request: getRequestFromHeaders(req.headers.toJSON(), runId), + setTimeout: (_val: number) => void 0, + }); + if (internalResponse) { + const headers = new Headers(internalResponse.headers); + if (!headers.has('Access-Control-Allow-Origin')) { + headers.set('Access-Control-Allow-Origin', '*'); + } + return new Response(internalResponse.body, { + status: internalResponse.status, + headers, + }); + } + }src/index.ts (1)
1-1: ESM path consistency for bundler/NodeNextNew imports/exports should include “.js” to match existing style (e.g., promptMetadata, patchportal) and avoid runtime resolution issues under node16/nodenext.
Apply:
-export * from './apis/eval'; +export * from './apis/eval.js'; @@ -export { hash, hashSync } from './utils/hash'; +export { hash, hashSync } from './utils/hash.js'; @@ -import EvalAPI from './apis/eval'; -import EvalJobScheduler from './apis/evaljobscheduler'; +import EvalAPI from './apis/eval.js'; +import EvalJobScheduler from './apis/evaljobscheduler.js';As per coding guidelines.
Also applies to: 5-5, 16-17
src/utils/promptMetadata.ts (1)
75-83: Unify hashing and fix typos (comiledPrompts) in concat path
- Use hashSync instead of direct crypto for consistency.
- Rename comiledPrompts → compiledPrompts.
- Drop the unused crypto import afterwards.
Apply:
-export async function processPromptMetadataConcat( - comiledPrompts: string[] -): Promise<string> { - const compiledPromptsString = comiledPrompts.join(''); +export async function processPromptMetadataConcat( + compiledPrompts: string[] +): Promise<string> { + const compiledPromptsString = compiledPrompts.join(''); @@ - comiledPrompts.map(async (compiledPrompt) => { + compiledPrompts.map(async (compiledPrompt) => { @@ - const newCompiledHash = crypto - .createHash('sha256') - .update(comiledPrompts.join('')) - .digest('hex'); + const newCompiledHash = hashSync(compiledPrompts.join(''));Additionally remove the top-level crypto import if now unused:
-import crypto from 'crypto';As per coding guidelines.
Also applies to: 97-101
src/router/context.ts (1)
134-139: Validate eval slugs using loaded metadataYou load evalMetadataMap but never use it. Validate slugs to avoid futile runEval calls and clearer logs.
Apply:
@@ - const evalMetadataMap = await evalAPI.loadEvalMetadataMap(); + const evalMetadataMap = await evalAPI.loadEvalMetadataMap(); logger.info(`📚 Loaded ${evalMetadataMap.size} eval mappings`); @@ - for (const evalSlug of promptMeta.evals) { + for (const evalSlug of promptMeta.evals) { + if (!evalMetadataMap.has(evalSlug)) { + logger.warn( + `⏭️ Unknown eval '${evalSlug}' — skipping (not in metadata map)` + ); + continue; + } try {Also applies to: 205-206
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
package-lock.jsonis excluded by!**/package-lock.json
📒 Files selected for processing (15)
.gitignore(1 hunks)package.json(2 hunks)src/apis/api.ts(3 hunks)src/apis/eval.ts(1 hunks)src/apis/evaljobscheduler.ts(1 hunks)src/apis/prompt/index.ts(1 hunks)src/index.ts(3 hunks)src/router/context.ts(2 hunks)src/router/router.ts(1 hunks)src/server/bun.ts(3 hunks)src/server/internal-routes.ts(1 hunks)src/server/node.ts(3 hunks)src/utils/hash.ts(1 hunks)src/utils/promptMetadata.ts(3 hunks)test/utils/hash.test.ts(1 hunks)
🧰 Additional context used
📓 Path-based instructions (10)
{src,test}/**/!(*.d).ts
📄 CodeRabbit inference engine (AGENT.md)
{src,test}/**/!(*.d).ts: Use strict TypeScript and prefer unknown over any
Use ESM import/export syntax; avoid CommonJS require/module.exports
Use relative imports for internal modules
Keep imports organized (sorted, no unused imports)
Use tabs with a visual width of 2 spaces
Limit lines to a maximum of 80 characters
Use single quotes for strings
Use proper Error types; do not throw strings
Prefer template literals over string concatenation
Files:
src/utils/hash.tssrc/router/router.tssrc/apis/prompt/index.tssrc/server/internal-routes.tssrc/utils/promptMetadata.tssrc/apis/eval.tssrc/server/bun.tssrc/apis/evaljobscheduler.tssrc/server/node.tssrc/apis/api.tstest/utils/hash.test.tssrc/index.tssrc/router/context.ts
src/router/**
📄 CodeRabbit inference engine (AGENT.md)
Router code (request/response handling, streaming) resides in src/router/
Files:
src/router/router.tssrc/router/context.ts
src/apis/**
📄 CodeRabbit inference engine (AGENT.md)
Place core API implementations under src/apis/ (email, discord, keyvalue, vector, objectstore)
Files:
src/apis/prompt/index.tssrc/apis/eval.tssrc/apis/evaljobscheduler.tssrc/apis/api.ts
src/apis/**/*.ts
📄 CodeRabbit inference engine (.cursor/rules/code-generation.mdc)
src/apis/**/*.ts: Do not hardcode generated prompt content (e.g., copyWriter) in source files; load it dynamically
Avoid overly complex TypeScript generics for generated content; prefer simple, maintainable types
Maintain type safety for dynamically loaded content by generating and referencing TypeScript definitions, with proper annotations for require() results
Do not use relative imports to generated artifacts; resolve via absolute node_modules paths or the package entry
Generated content is loaded at runtime, not build time; avoid static imports of generated modules
Prefer bracket notation for accessing slug-named properties with hyphens (e.g., prompts['slug-name'])
Avoid relative require('./generated/_index.js'); resolve absolute paths from process.cwd() into node_modules for generated assets
Files:
src/apis/prompt/index.tssrc/apis/eval.tssrc/apis/evaljobscheduler.tssrc/apis/api.ts
src/apis/prompt/index.ts
📄 CodeRabbit inference engine (.cursor/rules/code-generation.mdc)
src/apis/prompt/index.ts: Use dynamic loading with absolute paths; check dist first then src, and provide fallbacks when generated content is missing
Use require() (CommonJS) to load generated modules at runtime rather than import(), ensuring bundler compatibility
When loading generated modules, guard with fs.existsSync and wrap require() in try/catch; default to safe content on failure
Files:
src/apis/prompt/index.ts
src/apis/prompt/**/*.ts
📄 CodeRabbit inference engine (.cursor/rules/code-generation.mdc)
src/apis/prompt/**/*.ts: In TS interfaces that include hyphenated slugs, quote property names (e.g., 'slug-name') and access via bracket notation
Generated content must import from @agentuity/sdk rather than relative paths
Files:
src/apis/prompt/index.ts
src/server/*.ts
📄 CodeRabbit inference engine (.cursor/rules/code-generation.mdc)
src/server/*.ts: Load generated content during server context creation and provide fallbacks when signatures or content are missing
In compile helpers, prefer using generated signature functions when present and provide a manual compilation fallback when absent
Files:
src/server/internal-routes.tssrc/server/bun.tssrc/server/node.ts
src/server/{server,bun,node,agents}.ts
📄 CodeRabbit inference engine (AGENT.md)
Server components live in src/server/ as server.ts, bun.ts, node.ts, and agents.ts
Files:
src/server/bun.tssrc/server/node.ts
test/**
📄 CodeRabbit inference engine (AGENT.md)
Tests must mirror the source structure under the test/ directory
Files:
test/utils/hash.test.ts
src/index.ts
📄 CodeRabbit inference engine (AGENT.md)
src/index.ts must be the entry point and export server, logger, types, and APIs
Always export interpolateTemplate from the main SDK index
Files:
src/index.ts
🧠 Learnings (7)
📚 Learning: 2025-09-24T18:33:34.725Z
Learnt from: jhaynie
PR: agentuity/sdk-js#179
File: src/router/context.ts:36-58
Timestamp: 2025-09-24T18:33:34.725Z
Learning: In the AgentContextWaitUntilHandler class in src/router/context.ts, the running counter should only be decremented in waitUntilAll, not when individual promises complete. The design intent is that running represents work that hasn't been fully processed as a batch, maintaining the collective nature of the wait-until pattern where a session isn't done until all deferred work is processed together.
Applied to files:
src/router/router.ts
📚 Learning: 2025-10-03T22:20:06.957Z
Learnt from: CR
PR: agentuity/sdk-js#0
File: .cursor/rules/code-generation.mdc:0-0
Timestamp: 2025-10-03T22:20:06.957Z
Learning: Applies to src/apis/prompt/index.ts : When loading generated modules, guard with fs.existsSync and wrap require() in try/catch; default to safe content on failure
Applied to files:
src/apis/prompt/index.ts
📚 Learning: 2025-10-03T22:20:06.957Z
Learnt from: CR
PR: agentuity/sdk-js#0
File: .cursor/rules/code-generation.mdc:0-0
Timestamp: 2025-10-03T22:20:06.957Z
Learning: Applies to src/apis/prompt/index.ts : Use dynamic loading with absolute paths; check dist first then src, and provide fallbacks when generated content is missing
Applied to files:
src/apis/prompt/index.ts
📚 Learning: 2025-10-03T22:20:06.957Z
Learnt from: CR
PR: agentuity/sdk-js#0
File: .cursor/rules/code-generation.mdc:0-0
Timestamp: 2025-10-03T22:20:06.957Z
Learning: Applies to src/apis/prompt/index.ts : Use require() (CommonJS) to load generated modules at runtime rather than import(), ensuring bundler compatibility
Applied to files:
src/apis/prompt/index.ts
📚 Learning: 2025-10-03T22:20:06.957Z
Learnt from: CR
PR: agentuity/sdk-js#0
File: .cursor/rules/code-generation.mdc:0-0
Timestamp: 2025-10-03T22:20:06.957Z
Learning: Applies to src/apis/**/*.ts : Do not hardcode generated prompt content (e.g., copyWriter) in source files; load it dynamically
Applied to files:
src/apis/prompt/index.ts
📚 Learning: 2025-10-03T22:20:06.957Z
Learnt from: CR
PR: agentuity/sdk-js#0
File: .cursor/rules/code-generation.mdc:0-0
Timestamp: 2025-10-03T22:20:06.957Z
Learning: Applies to src/apis/prompt/**/*.ts : Generated content must import from agentuity/sdk rather than relative paths
Applied to files:
src/apis/prompt/index.ts
📚 Learning: 2025-10-03T22:20:06.957Z
Learnt from: CR
PR: agentuity/sdk-js#0
File: .cursor/rules/code-generation.mdc:0-0
Timestamp: 2025-10-03T22:20:06.957Z
Learning: Applies to src/apis/**/*.ts : Avoid relative require('./generated/_index.js'); resolve absolute paths from process.cwd() into node_modules for generated assets
Applied to files:
src/apis/prompt/index.ts
🧬 Code graph analysis (11)
src/utils/hash.ts (2)
src/router/data.ts (1)
arrayBuffer(188-191)src/index.ts (2)
hash(5-5)hashSync(5-5)
src/router/router.ts (2)
src/logger/index.ts (1)
logger(3-3)src/logger/user.ts (1)
logger(8-8)
src/apis/prompt/index.ts (1)
src/logger/internal.ts (2)
internal(178-194)error(128-132)
src/server/internal-routes.ts (1)
src/server/types.ts (1)
ServerRequest(23-31)
src/utils/promptMetadata.ts (2)
src/apis/patchportal.ts (1)
PatchPortal(11-125)src/utils/hash.ts (1)
hashSync(27-29)
src/apis/eval.ts (1)
src/apis/api.ts (1)
POST(254-271)
src/server/bun.ts (4)
src/apis/evaljobscheduler.ts (1)
EvalJobScheduler(24-173)src/server/internal-routes.ts (1)
InternalRoutesHandler(8-33)src/types.ts (1)
ReadableDataType(134-139)src/server/util.ts (1)
getRequestFromHeaders(337-360)
src/apis/evaljobscheduler.ts (1)
src/index.ts (2)
EvalJobScheduler(25-25)PromptAttributes(8-8)
src/server/node.ts (3)
src/apis/evaljobscheduler.ts (1)
EvalJobScheduler(24-173)src/server/internal-routes.ts (1)
InternalRoutesHandler(8-33)src/server/util.ts (1)
getRequestFromHeaders(337-360)
test/utils/hash.test.ts (2)
src/index.ts (2)
hash(5-5)hashSync(5-5)src/utils/hash.ts (2)
hash(15-21)hashSync(27-29)
src/router/context.ts (4)
src/apis/session.ts (1)
markSessionCompleted(6-18)src/apis/evaljobscheduler.ts (1)
EvalJobScheduler(24-173)src/apis/eval.ts (1)
EvalAPI(89-465)src/utils/promptMetadata.ts (1)
PromptAttributes(14-17)
🔇 Additional comments (11)
.gitignore (1)
6-7: LGTM!Adding
.DS_Storeto the ignore list is a sensible addition for macOS development environments, preventing system metadata files from being accidentally committed.package.json (1)
82-82: @clickhouse/client is not currently imported or used in the codebase.The verification found no source code imports of
@clickhouse/clientanywhere in the repository. The package appears only inpackage.json(line 82) andpackage-lock.jsonas a direct dependency, but no actual usage exists.The original review concerns about server-side guards, browser bundling impact, and dynamic import patterns are not applicable until the package is actually used in code. If this package was added as a placeholder for future use, import it first; if it was added accidentally, remove it.
No security vulnerabilities were found for the package.
Likely an incorrect or invalid review comment.
src/apis/api.ts (2)
32-37: ServiceName union: looks goodThe union cleanly adds 'eval' and propagates through call sites.
110-113: Eval base URL case: OKResolves AGENTUITY_EVAL_URL with sensible fallback to transport URL.
src/server/bun.ts (1)
68-70: Scheduler init at startup: goodSingleton created early; avoids lazy-init races.
src/server/node.ts (1)
110-114: Scheduler init at startup: goodKeeps singleton alive across requests.
test/utils/hash.test.ts (1)
1-61: Hash tests: solid coverageParities and edge cases covered; assertions are precise.
src/utils/promptMetadata.ts (2)
11-11: Evals field propagation looks goodAdding evals?: string[] to PromptAttributesParams is reasonable and non-breaking.
38-42: Good: centralized hashing via hashSyncUsing hashSync for template/compiled hashes improves consistency and testability.
src/router/context.ts (1)
211-221: Confirm which hash EvalAPI expects (template vs compiled)You pass templateHash to runEval(promptHash?). Should this be compiledHash instead? Verify API semantics for downstream correlation.
Would you like me to scan call sites and API docs to confirm and update accordingly?
src/utils/hash.ts (1)
1-1: Review comment is incorrect; no changes needed.The package description explicitly states this is "The Agentuity SDK for NodeJS and Bun" with Node.js >= 22 as the minimum version. In Node.js 22,
import crypto from 'crypto'correctly exposescrypto.subtle, and the current implementation is valid. The review comment incorrectly assumes browser compatibility is required and suggests unnecessary changes (globalThis fallback) for a server-only SDK. The codebase is correct as-is.Likely an incorrect or invalid review comment.
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (3)
src/apis/evaljobscheduler.ts (2)
119-149: Logging uses debug level appropriately.The method logs job details at
internal.debuglevel throughout, which is appropriate for detailed operational data. This addresses the concern from the previous review about info-level logging.
154-168: printState still dumps full jobs array with potential sensitive data.The method continues to serialize complete job objects without redaction. Prompt metadata may contain sensitive inputs, outputs, or other PII.
Apply the previously suggested fix:
- public printState(): void { - internal.debug('🔍 EvalJobScheduler.printState() called'); - internal.debug('🔍 Instance ID:', this.instanceId); - internal.debug('🔍 EvalJobScheduler State:'); - internal.debug('📊 Total jobs:', this.pendingJobs.size); - internal.debug('📋 Job IDs:', Array.from(this.pendingJobs.keys())); - internal.debug( - '📦 All jobs:', - JSON.stringify(Array.from(this.pendingJobs.values()), null, 2) - ); - internal.debug( - '🔍 Global instance check:', - globalThis.__evalJobSchedulerInstance === this - ); - } + public printState(): void { + internal.debug('🔍 EvalJobScheduler.printState() called'); + internal.debug('🔍 Instance ID:', this.instanceId); + internal.debug('🔍 EvalJobScheduler State:'); + internal.debug('📊 Total jobs:', this.pendingJobs.size); + internal.debug('📋 Job IDs:', Array.from(this.pendingJobs.keys())); + if (process.env.AGENTUITY_DEBUG_EVALS === '1') { + internal.debug( + '📦 All jobs (redacted):', + JSON.stringify( + Array.from(this.pendingJobs.values()).map(j => ({ + spanId: j.spanId, + sessionId: j.sessionId, + promptMetadataCount: j.promptMetadata?.length ?? 0, + hasOutput: !!j.output, + createdAt: j.createdAt, + })), + null, + 2 + ) + ); + } + internal.debug( + '🔍 Global instance check:', + globalThis.__evalJobSchedulerInstance === this + ); + }src/router/context.ts (1)
82-86: Zero-promises branch still missingmarkSessionCompletedcall.The early-return path executes evals but never marks the session completed, leaving the session in an incomplete state from a tracking perspective. This inconsistency with the Promise.all branch (line 96) means metrics and downstream consumers won't see the session end.
Apply this fix:
if (this.promises.length === 0) { internal.debug('No promises to wait for, executing evals directly'); + await markSessionCompleted(sessionId, 0); await this.executeEvalsForSession(logger, sessionId); return; }
🧹 Nitpick comments (2)
src/apis/evaljobscheduler.ts (1)
40-60: Consider makinggetInstance()synchronous.The method is declared
asyncbut performs no asynchronous operations. Unless async initialization is planned for the future, this could be simplified to a synchronous method returningEvalJobSchedulerdirectly.If desired, apply this diff:
- public static async getInstance(): Promise<EvalJobScheduler> { + public static getInstance(): EvalJobScheduler { internal.debug('🔍 EvalJobScheduler.getInstance() called'); internal.debug( '🔍 globalThis.__evalJobSchedulerInstance exists:', !!globalThis.__evalJobSchedulerInstance ); if (!globalThis.__evalJobSchedulerInstance) { globalThis.__evalJobSchedulerInstance = new EvalJobScheduler(); internal.debug( '🆕 Created new EvalJobScheduler instance, ID:', globalThis.__evalJobSchedulerInstance.instanceId ); } else { internal.debug( '♻️ Returning existing EvalJobScheduler instance, ID:', globalThis.__evalJobSchedulerInstance.instanceId ); } return globalThis.__evalJobSchedulerInstance; }src/router/context.ts (1)
112-174: Consider reusingEvalAPIinstance or document stateless design.A new
EvalAPIinstance is created for each session (line 138). IfEvalAPIis stateless and lightweight, this is fine. Otherwise, consider reusing a single instance or caching theevalMetadataMapto avoid repeated file system scans.Also, ensure cleanup failures are non-critical: the
removeJob()loop (lines 167-169) doesn't check return values, which is acceptable if job removal is best-effort.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
src/apis/api.ts(2 hunks)src/apis/eval.ts(1 hunks)src/apis/evaljobscheduler.ts(1 hunks)src/router/context.ts(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- src/apis/eval.ts
- src/apis/api.ts
🧰 Additional context used
📓 Path-based instructions (4)
src/router/**
📄 CodeRabbit inference engine (AGENT.md)
Router code (request/response handling, streaming) resides in src/router/
Files:
src/router/context.ts
{src,test}/**/!(*.d).ts
📄 CodeRabbit inference engine (AGENT.md)
{src,test}/**/!(*.d).ts: Use strict TypeScript and prefer unknown over any
Use ESM import/export syntax; avoid CommonJS require/module.exports
Use relative imports for internal modules
Keep imports organized (sorted, no unused imports)
Use tabs with a visual width of 2 spaces
Limit lines to a maximum of 80 characters
Use single quotes for strings
Use proper Error types; do not throw strings
Prefer template literals over string concatenation
Files:
src/router/context.tssrc/apis/evaljobscheduler.ts
src/apis/**
📄 CodeRabbit inference engine (AGENT.md)
Place core API implementations under src/apis/ (email, discord, keyvalue, vector, objectstore)
Files:
src/apis/evaljobscheduler.ts
src/apis/**/*.ts
📄 CodeRabbit inference engine (.cursor/rules/code-generation.mdc)
src/apis/**/*.ts: Do not hardcode generated prompt content (e.g., copyWriter) in source files; load it dynamically
Avoid overly complex TypeScript generics for generated content; prefer simple, maintainable types
Maintain type safety for dynamically loaded content by generating and referencing TypeScript definitions, with proper annotations for require() results
Do not use relative imports to generated artifacts; resolve via absolute node_modules paths or the package entry
Generated content is loaded at runtime, not build time; avoid static imports of generated modules
Prefer bracket notation for accessing slug-named properties with hyphens (e.g., prompts['slug-name'])
Avoid relative require('./generated/_index.js'); resolve absolute paths from process.cwd() into node_modules for generated assets
Files:
src/apis/evaljobscheduler.ts
🧬 Code graph analysis (1)
src/router/context.ts (4)
src/apis/session.ts (1)
markSessionCompleted(6-18)src/apis/evaljobscheduler.ts (1)
EvalJobScheduler(24-176)src/apis/eval.ts (1)
EvalAPI(89-465)src/utils/promptMetadata.ts (1)
PromptAttributes(14-17)
🔇 Additional comments (4)
src/apis/evaljobscheduler.ts (2)
65-104: LGTM: Job creation logic is sound.The method correctly creates or overwrites jobs, counts evals from metadata, and maintains consistent state. Debug logging provides good observability.
109-114: LGTM: Removal logic is straightforward.The method correctly removes jobs and returns the deletion status.
src/router/context.ts (2)
1-12: LGTM: Import additions are appropriate.The new imports (trace, EvalAPI, EvalJobScheduler, internal logger, PromptAttributes) properly support the eval execution functionality added to this file.
88-107: LGTM: Promise.all branch properly sequences completion and eval execution.The flow correctly waits for all promises, marks the session completed with accurate duration, executes evals post-completion, and handles cleanup.
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (2)
src/apis/eval.ts (2)
29-33: Replaceanywithunknownin index signature.This issue was already flagged in a previous review. The index signature should use
unknownto align with strict TypeScript guidelines, and the biome-ignore comment should be removed.
317-331: Improve HTTP status validation to check 2xx range.The current check
if (!resp.status)only verifies that a status exists, not whether it indicates success. A proper HTTP status check should validate the 2xx range (200-299).Apply this diff:
-if (!resp.status) { - internal.debug('Failed to update the database with the eval run', { +if (!resp || typeof resp.status !== 'number' || resp.status < 200 || resp.status >= 300) { + internal.debug('Failed to update the database with the eval run', { status: resp.status, evalName, evalId: r.evalId, sessionId: r.sessionId, }); } else { internal.debug('Eval run updated in the database', { status: resp.status, evalName, evalId: r.evalId, sessionId: r.sessionId, }); }
🧹 Nitpick comments (1)
src/apis/eval.ts (1)
111-158: Consider extracting shared logic to reduce duplication.Both
loadEvalByNameandloadEvalByIdshare nearly identical file-scanning, filtering, and dynamic-import logic. Only the metadata matching criterion differs (slug vs. id). Extracting a private helper method would improve maintainability.Example refactor:
private async findEvalByPredicate( predicate: (metadata: any) => boolean ): Promise<{ evalFn: EvalFunction; metadata: { id: string; slug: string; name: string; description: string }; }> { const files = fs.readdirSync(this.evalsDir); const expectedExt = this.isBundled ? '.js' : '.ts'; for (const file of files) { if (file === 'index.ts' || file === 'index.js') continue; if (!file.endsWith(expectedExt)) continue; const filePath = path.join(this.evalsDir, file); try { const fileUrl = pathToFileURL(filePath).href; const module = await import(fileUrl); if (module.metadata && predicate(module.metadata)) { return { evalFn: module.default, metadata: module.metadata }; } } catch (error) { internal.debug(`Skipping file ${file} due to import error: ${error}`); } } throw new Error('No eval found matching predicate'); } async loadEvalByName(evalName: string) { return this.findEvalByPredicate(m => m.slug === evalName); } async loadEvalById(evalId: string) { return this.findEvalByPredicate(m => m.id === evalId); }Also applies to: 164-211
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
src/apis/eval.ts(1 hunks)src/apis/patchportal.ts(1 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
src/apis/**
📄 CodeRabbit inference engine (AGENT.md)
Place core API implementations under src/apis/ (email, discord, keyvalue, vector, objectstore)
Files:
src/apis/patchportal.tssrc/apis/eval.ts
{src,test}/**/!(*.d).ts
📄 CodeRabbit inference engine (AGENT.md)
{src,test}/**/!(*.d).ts: Use strict TypeScript and prefer unknown over any
Use ESM import/export syntax; avoid CommonJS require/module.exports
Use relative imports for internal modules
Keep imports organized (sorted, no unused imports)
Use tabs with a visual width of 2 spaces
Limit lines to a maximum of 80 characters
Use single quotes for strings
Use proper Error types; do not throw strings
Prefer template literals over string concatenation
Files:
src/apis/patchportal.tssrc/apis/eval.ts
src/apis/**/*.ts
📄 CodeRabbit inference engine (.cursor/rules/code-generation.mdc)
src/apis/**/*.ts: Do not hardcode generated prompt content (e.g., copyWriter) in source files; load it dynamically
Avoid overly complex TypeScript generics for generated content; prefer simple, maintainable types
Maintain type safety for dynamically loaded content by generating and referencing TypeScript definitions, with proper annotations for require() results
Do not use relative imports to generated artifacts; resolve via absolute node_modules paths or the package entry
Generated content is loaded at runtime, not build time; avoid static imports of generated modules
Prefer bracket notation for accessing slug-named properties with hyphens (e.g., prompts['slug-name'])
Avoid relative require('./generated/_index.js'); resolve absolute paths from process.cwd() into node_modules for generated assets
Files:
src/apis/patchportal.tssrc/apis/eval.ts
🧬 Code graph analysis (1)
src/apis/eval.ts (1)
src/apis/api.ts (1)
POST(253-270)
🔇 Additional comments (1)
src/apis/patchportal.ts (1)
17-17: LGTM! Good migration from deprecatedsubstrtosubstring.The behavior is identical—both extract 9 characters starting at position 2—and this change aligns with modern JavaScript standards.
Summary by CodeRabbit
New Features
Bug Fixes
Tests
Chores